 physical review letter


More Bang for the Buck: Improving the Inference of Large Language Models at a Fixed Budget using Reset and Discard (ReD)

Meir, Sagi, Keidar, Tommer D., Levi, Noam, Reuveni, Shlomi, Hirshberg, Barak

arXiv.org Machine Learning

The performance of large language models (LLMs) on verifiable tasks is usually measured by pass@k, the probability of answering a question correctly at least once in k trials. At a fixed budget, a more suitable metric is coverage@cost, the average number of unique questions answered as a function of the total number of attempts. We connect the two metrics and show that the empirically-observed power-law behavior in pass@k leads to a sublinear growth of the coverage@cost (diminishing returns). To solve this problem, we propose Reset-and-Discard (ReD), a query method of LLMs that increases coverage@cost for any given budget, regardless of the pass@k form. Moreover, given a pass@k, we can quantitatively predict the savings in the total number of attempts using ReD. If pass@k is not available for the model, ReD can infer its power-law exponent. Experiments on three LLMs using HumanEval demonstrate that ReD substantially reduces the required attempts, tokens, and USD cost to reach a desired coverage, while also offering an efficient way to measure inference power-laws.
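As a rough illustration of the pass@k definition above (not the authors' code), the sketch below assumes each question has a fixed per-attempt success probability and that attempts are independent, so the chance of at least one correct answer in k trials is 1 - (1 - p)^k. The function name `pass_at_k` and the example probabilities are hypothetical.

```python
def pass_at_k(success_probs, k):
    """Average probability of at least one correct answer in k attempts.

    Assumption (not from the paper): each question i has a fixed
    per-attempt success probability success_probs[i], and attempts
    are independent of one another.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if not success_probs:
        raise ValueError("need at least one question")
    # For one question, P(at least one success in k tries) = 1 - (1 - p)^k.
    per_question = [1.0 - (1.0 - p) ** k for p in success_probs]
    return sum(per_question) / len(per_question)


if __name__ == "__main__":
    # Hypothetical mix of easy and hard questions.
    probs = [0.9, 0.5, 0.1, 0.01]
    for k in (1, 2, 4, 8, 16):
        print(k, round(pass_at_k(probs, k), 4))
```

Under these assumptions each additional attempt adds less to pass@k than the one before, which is the diminishing-returns effect that motivates looking at coverage under a fixed budget.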


Optimal certification of constant-local Hamiltonians

Lee, Junseo, Shin, Myeongjin

arXiv.org Artificial Intelligence

We study the problem of certifying local Hamiltonians from real-time access to their dynamics. Given oracle access to $e^{-itH}$ for an unknown $k$-local Hamiltonian $H$ and a fully specified target Hamiltonian $H_0$, the goal is to decide whether $H$ is exactly equal to $H_0$ or differs from $H_0$ by at least $\varepsilon$ in normalized Frobenius norm, while minimizing the total evolution time. We introduce the first intolerant Hamiltonian certification protocol that achieves optimal performance for all constant-locality Hamiltonians. For general $n$-qubit, $k$-local, traceless Hamiltonians, our procedure uses $O(c^k/\varepsilon)$ total evolution time for a universal constant $c$, and succeeds with high probability. In particular, for $O(1)$-local Hamiltonians, the total evolution time becomes $Θ(1/\varepsilon)$, matching the known $Ω(1/\varepsilon)$ lower bounds and achieving the gold-standard Heisenberg-limit scaling. Prior certification methods either relied on implementing inverse evolution of $H$, required controlled access to $e^{-itH}$, or achieved near-optimal guarantees only in restricted settings such as the Ising case ($k=2$). In contrast, our algorithm requires neither inverse evolution nor controlled operations: it uses only forward real-time dynamics and achieves optimal intolerant certification for all constant-locality Hamiltonians.


HPC-Driven Modeling with ML-Based Surrogates for Magnon-Photon Dynamics in Hybrid Quantum Systems

Song, Jialin, Tang, Yingheng, Ren, Pu, Takayoshi, Shintaro, Sawant, Saurabh, Zhu, Yujie, Hu, Jia-Mian, Nonaka, Andy, Mahoney, Michael W., Erichson, Benjamin, Yao, Zhi

arXiv.org Artificial Intelligence

Simulating hybrid magnonic quantum systems remains a challenge due to the large disparity between the timescales of the two systems. We present a massively parallel GPU-based simulation framework that enables fully coupled, large-scale modeling of on-chip magnon-photon circuits. To accelerate design workflows, we develop a physics-informed machine learning surrogate trained on the simulation data, reducing computational cost while maintaining accuracy. This combined approach reveals real-time energy exchange dynamics and reproduces key phenomena such as anti-crossing behavior and the suppression of ferromagnetic resonance under strong electromagnetic fields. By addressing the multiscale and multiphysics challenges in magnon-photon modeling, our framework enables scalable simulation and rapid prototyping of next-generation quantum and spintronic devices.

1 Introduction

Hybrid quantum systems, which combine distinct physical platforms, are a promising route toward advanced quantum technologies, as they harness strong interactions that may not be readily achievable in a single platform [1, 2]. These systems take many forms, coupling any two (or more) quantum platforms -- for example, superconducting qubits [3, 4], microwave resonators [5], single spins [6], spin ensembles [4, 7-9], or mechanical resonators [10-12] -- to harness strong interactions. These heterogeneous systems leverage complementary advantages of each component, but their rich multi-physics interactions pose formidable modeling challenges. A prominent example is cavity magnonics, where collective spin excitations (magnons) couple with microwave photons in a resonant cavity to form hybrid magnon-polariton modes when tuned into resonance [13-15]. These states are essential for quantum operations such as mode swapping [16, 17], quantum state storage [4, 18, 19], and dynamic control of energy exchange [19, 20].
The hallmark experimental signature of strong magnon-photon coupling is a pronounced avoided crossing (mode splitting) in the frequency spectrum, in agreement with theoretical predictions [21] and observed in many 3D [13, 22] and on-chip 2D [7, 8, 23] cavity based systems.


ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation

Neural Information Processing Systems

Neural TensorNet parameterizes normalized wavefunctions, allows for exact sampling, generalizes the expressivity of tensor networks and autoregressive neural networks, and inherits a variety of symmetries from autoregressive neural networks.


Model-free learning of probability flows: Elucidating the nonequilibrium dynamics of flocking

Boffi, Nicholas M., Vanden-Eijnden, Eric

arXiv.org Artificial Intelligence

Active systems comprise a class of nonequilibrium dynamics in which individual components autonomously dissipate energy. Efforts towards understanding the role played by activity have centered on computation of the entropy production rate (EPR), which quantifies the breakdown of time reversal symmetry. A fundamental difficulty in this program is that high dimensionality of the phase space renders traditional computational techniques infeasible for estimating the EPR. Here, we overcome this challenge with a novel deep learning approach that estimates probability currents directly from stochastic system trajectories. We derive a new physical connection between the probability current and two local definitions of the EPR for inertial systems, which we apply to characterize the departure from equilibrium in a canonical model of flocking. Our results highlight that entropy is produced and consumed on the spatial interface of a flock as the interplay between alignment and fluctuation dynamically creates and annihilates order. By enabling the direct visualization of when and where a given system is out of equilibrium, we anticipate that our methodology will advance the understanding of a broad class of complex nonequilibrium dynamics.


Which bits went where? Past and future transfer entropy decomposition with the information bottleneck

Murphy, Kieran A., Yin, Zhuowen, Bassett, Dani S.

arXiv.org Artificial Intelligence

Whether the system under study is a shoal of fish, a collection of neurons, or a set of interacting atmospheric and oceanic processes, transfer entropy measures the flow of information between time series and can detect possible causal relationships. Much like mutual information, transfer entropy is generally reported as a single value summarizing an amount of shared variation, yet a more fine-grained accounting might illuminate much about the processes under study. Here we propose to decompose transfer entropy and localize the bits of variation on both sides of information flow: that of the originating process's past and that of the receiving process's future. We employ the information bottleneck (IB) to compress the time series and identify the transferred entropy. We apply our method to decompose the transfer entropy in several synthetic recurrent processes and an experimental mouse dataset of concurrent behavioral and neural activity. Our approach highlights the nuanced dynamics within information flow, laying a foundation for future explorations into the intricate interplay of temporal processes in complex systems.
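For readers unfamiliar with the quantity being decomposed, the sketch below shows a simple plug-in estimate of transfer entropy between two discrete time series. It is not the information-bottleneck method proposed in this paper: it assumes a history length of one step and estimates probabilities from raw counts. The function name `transfer_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in transfer entropy (bits) from source to target, history length 1.

    Computes sum over (y_next, y_prev, x_prev) of
        p(y_next, y_prev, x_prev) * log2( p(y_next | y_prev, x_prev)
                                          / p(y_next | y_prev) )
    with probabilities estimated by counting. This is a textbook-style
    estimator used for illustration, not the paper's approach.
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    if len(target) < 2:
        raise ValueError("need at least two time steps")

    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)

    joint = Counter(triples)
    past_pair = Counter((y_prev, x_prev) for _, y_prev, x_prev in triples)
    self_pair = Counter((y_next, y_prev) for y_next, y_prev, _ in triples)
    past_self = Counter(y_prev for _, y_prev, _ in triples)

    te = 0.0
    for (y_next, y_prev, x_prev), count in joint.items():
        p_joint = count / n
        p_given_both = count / past_pair[(y_prev, x_prev)]
        p_given_self = self_pair[(y_next, y_prev)] / past_self[y_prev]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


if __name__ == "__main__":
    x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
    y = [0] + x[:-1]  # y copies x with a one-step delay
    print(transfer_entropy(x, y))  # information flows from x to y
    print(transfer_entropy(y, x))
```

The estimator returns a single number per direction, which is the kind of summary value the abstract describes; counting-based estimates like this one are biased for short series, so the output should be read as illustrative only.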


Generative Neural Reparameterization for Differentiable PDE-constrained Optimization

Joglekar, Archis S.

arXiv.org Artificial Intelligence

Partial-differential-equation (PDE)-constrained optimization is a well-worn technique for acquiring optimal parameters of systems governed by PDEs. However, this approach is limited to providing a single set of optimal parameters per optimization. Given a differentiable PDE solver, if the free parameters are reparameterized as the output of a neural network, that neural network can be trained to learn a map from a probability distribution to the distribution of optimal parameters. This proves useful in the case where there are many well performing local minima for the PDE. We apply this technique to train a neural network that generates optimal parameters that minimize laser-plasma instabilities relevant to laser fusion and show that the neural network generates many well performing and diverse minima.


Parameterized quantum comb and simpler circuits for reversing unknown qubit-unitary operations

Mo, Yin, Zhang, Lei, Chen, Yu-Ao, Liu, Yingjian, Lin, Tengxiang, Wang, Xin

arXiv.org Artificial Intelligence

In quantum computing, we are capable not only of transforming states but also of transforming processes. Designing quantum circuits to transform input operations has a wide range of applications in quantum computing, quantum information processing, and quantum machine learning. The networks that perform such transformations are known as super-channels [1, 2], which take processes as inputs and output the corresponding transformed process. In general, all these super-channels can be realized with the quantum comb architecture [1, 2]. Figure 1 illustrates an example where a quantum comb takes m quantum operations as input and outputs a target new operation. Quantum comb is widely applied in solving process transformation problems and optimizing the ultimate achievable performance, including transformations of unitary operations such as inversion [3, 4], complex conjugation, control-U analysis [5], as well as learning tasks [6, 7]. It can also be used for analyzing more general processes [8] and has also inspired structures like the indefinite causal network [9, 10]. However, obtaining the explicit quantum circuit required for the desired transformation is a challenging problem. A major problem of the semidefinite programming (SDP) approach based on the Choi-Jamiołkowski isomorphism is that the dimension of the Choi operator of the quantum comb, i.e., the dimension of the variable in such SDP problems, grows exponentially fast with the increase in the number of comb slots. Another issue is that the SDP ultimately returns the Choi operator of the quantum comb; however, finding a physical implementation of this network, such as converting it into a standard circuit model, is not straightforward.


Exciton-Polariton Condensates: A Fourier Neural Operator Approach

Sathujoda, Surya T., Wang, Yuan, Gandhi, Kanishk

arXiv.org Artificial Intelligence

Advancements in semiconductor fabrication over the past decade have catalyzed extensive research into all-optical devices driven by exciton-polariton condensates. Preliminary validations of such devices, including transistors, have shown encouraging results even under ambient conditions. However, a significant challenge remains for large-scale application: the lack of a robust solver that can be used to simulate complex nonlinear systems which require an extended period of time to stabilize. Addressing this need, we propose the application of a machine-learning-based Fourier Neural Operator approach to find the solution to the Gross-Pitaevskii equations coupled with extra exciton rate equations. This work marks the first direct application of Neural Operators to an exciton-polariton condensate system. Our findings show that the proposed method can predict final-state solutions to a high degree of accuracy almost 1000 times faster than CUDA-based GPU solvers. Moreover, this paves the way for potential all-optical chip design workflows by integrating experimental data.


Quantum Data Center: Perspectives

Liu, Junyu, Jiang, Liang

arXiv.org Machine Learning

A quantum version of data centers might be significant in the quantum era. In this paper, we introduce Quantum Data Center (QDC), a quantum version of existing classical data centers, with a specific emphasis on combining Quantum Random Access Memory (QRAM) and quantum networks. We argue that QDC will provide significant benefits to customers in terms of efficiency, security, and precision, and will be helpful for quantum computing, communication, and sensing. We investigate potential scientific and business opportunities along this novel research direction through hardware realization and possible specific applications. We show the possible impacts of QDCs in business and science, especially the machine learning and big data industries.